TagBook: A Semantic Video Representation Without Supervision for Event Detection

Similar resources

Road Detection and Semantic Segmentation without Strong Human Supervision

Recently, convolutional neural networks (CNNs) trained with strong human supervision have been shown to achieve state-of-the-art performance for both road detection and semantic segmentation. However, collecting strongly labeled data for both tasks requires detailed per-pixel annotations from humans, which makes data annotation highly costly and time-consuming. Therefore, in this work we propose methods t...

Expressing Implicit Semantic Relations without Supervision

We present an unsupervised learning algorithm that mines large text corpora for patterns that express implicit semantic relations. For a given input word pair X:Y with some unspecified semantic relations, the corresponding output list of patterns P_1, ..., P_m is ranked according to how well each pattern P_i expresses the relations between X and Y. For example, given X = ostrich and Y = bird,...

Semantic Event Detection in Structured Video Using Hybrid HMM/SVM

In this paper, we propose a new semantic event detection algorithm for structured video. The algorithm is a hybrid method that combines an HMM with an SVM to detect semantic events in video. The proposed detection method has two advantages: it suits the temporal structure of events thanks to Hidden Markov Models (HMM), and it guarantees high classification accuracy thanks to Support Vector Machine...

Learning Rules for Semantic Video Event Annotation

Automatic semantic annotation of video events has received considerable attention from the scientific community in recent years, since event recognition is an important task in many applications. Events can be defined by spatio-temporal relations and by properties of objects and entities that change over time; some events can be described by a set of patterns. In this paper we present a framework ...

Cross-Modal Supervision for Learning Active Speaker Detection in Video

In this paper, we show how to use audio to supervise the learning of active speaker detection in video. Voice Activity Detection (VAD) guides the learning of the vision-based classifier in a weakly supervised manner. The classifier uses spatio-temporal features to encode upper body motion (facial expressions and gesticulations associated with speaking). We further improve a generic model for acti...

Journal

Journal title: IEEE Transactions on Multimedia

Year: 2016

ISSN: 1520-9210, 1941-0077

DOI: 10.1109/tmm.2016.2559947